# Low-memory inference

SmolLM 135M Instruct
Apache-2.0
A lightweight, instruction-tuned language model optimized for mobile deployment.
Large Language Model
litert-community
131
1
OpenHands LM 7B v0.1 GGUF
MIT
OpenHands LM is an open-source coding model built on Qwen Coder 2.5 Instruct 32B and fine-tuned for software engineering tasks.
Large Language Model English
Mungert
1,131
2
Falcon E 3B Instruct
Other
Falcon-E-3B-Instruct is an efficient language model based on a 1.58-bit architecture, optimized for edge devices with strong inference performance and low memory usage.
Large Language Model Transformers
tiiuae
225
22
Falcon E 1B Instruct
Other
Falcon-E-1B-Instruct is an efficient language model based on a 1.58-bit architecture, optimized for edge devices with a low memory footprint and high performance.
Large Language Model Transformers
tiiuae
87
7
All MiniLM L6 V2 GGUF
Apache-2.0
all-MiniLM-L6-v2 is a compact and efficient sentence embedding model based on the MiniLM architecture, suitable for sentence similarity computation and feature extraction tasks.
Text Embedding English
Mungert
1,094
2
Meta Llama 3 8B Instruct GGUF
An ultra-low-bit (1-2 bit) IQ-DynamicGate quantization of Llama-3-8B-Instruct that uses precision-adaptive quantization to improve inference accuracy while keeping memory usage very low.
Large Language Model English
Mungert
1,343
3
SmolVLM2 2.2B Instruct
Apache-2.0
SmolVLM2-2.2B is a lightweight multimodal model designed for analyzing video content. It can process video, image, and text inputs and generate text outputs.
Image-to-Text Transformers English
HuggingFaceTB
62.56k
164
MosaicML MPT 7B Chat bnb 4bit Smashed
A version of MPT-7B-Chat compressed by PrunaAI with llm-int8 to significantly reduce memory usage and energy consumption.
Large Language Model Transformers Other
PrunaAI
30
1